Machine Translation on the Medical Domain: The Role of BLEU/NIST and METEOR in a Controlled Vocabulary Setting
Abstract
The main objective of our project is to extract clinical information from thoracic radiology reports in Portuguese using Machine Translation (MT) and cross-language information retrieval techniques. To accomplish this task, we need to evaluate the machine translation system involved. Since human MT evaluation is costly and time-consuming, we opted to use automated methods. We propose an evaluation methodology using the NIST/BLEU and METEOR algorithms and a controlled medical vocabulary, the Unified Medical Language System (UMLS). A set of documents is generated, and each document is either machine translated or used as an evaluation reference. This methodology is used to evaluate the performance of our specialized Portuguese-English translation dictionary. We demonstrate a significant improvement in evaluation scores after incorporating the dictionary into a commercial MT system. The use of UMLS and automated MT evaluation techniques can help the development of applications in the medical domain. Our methodology can also be used in general MT research for evaluation and testing purposes.
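As a rough illustration of the kind of automated scoring the abstract refers to, the sketch below computes a simplified, single-reference, sentence-level BLEU score (clipped n-gram precision up to bigrams plus a brevity penalty) and compares a translation before and after a terminology fix. It is not the official NIST/BLEU scoring script and not the authors' evaluation pipeline; the function names and the example sentences are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=2):
    """Simplified single-reference, sentence-level BLEU (no smoothing)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


# Hypothetical example sentences (not taken from the project data).
reference = "small pleural effusion on the right side"
baseline = "small pleural spill on the right side"
with_dictionary = "small pleural effusion on the right side"

print(simple_bleu(baseline, reference))
print(simple_bleu(with_dictionary, reference))
```

In this toy case the output that uses the correct medical term matches the reference exactly and scores higher than the baseline, which is the kind of score difference the evaluation is meant to expose. Real evaluations typically score whole document sets against one or more references rather than single sentences.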
Similar resources
Meteor, m-bleu and m-ter: Flexible Matching and Parameter Tuning for High-Correlation with Human Judgments of Machine Translation Quality
We describe our submission to the NIST Metrics for Machine Translation Challenge, consisting of 4 metrics: two versions of meteor, m-bleu and m-ter. We first give a brief description of Meteor. That is followed by a description of m-bleu and m-ter, enhanced versions of two other widely used metrics, bleu and ter, which extend the exact word matching used in these metrics with the flexible matching ...
The TALP-UPC Phrase-Based Translation Systems for WMT13: System Combination with Morphology Generation, Domain Adaptation and Corpus Filtering
This paper describes the TALP participation in the WMT13 evaluation campaign. Our participation is based on the combination of several statistical machine translation systems based on standard phrase-based Moses systems. Variations include techniques such as morphology generation, training sentence filtering, and domain adaptation through unit derivation. The results show a coherent improvement...
Polish to English Statistical Machine Translation
This research explores the effects of various training settings on a Polish to English Statistical Machine Translation system for spoken language. Various elements of the TED, Europarl, and OPUS parallel text corpora were used as the basis for training of language models, for development, tuning and testing of the translation system. The BLEU, NIST, METEOR and TER metrics were used to evaluate ...
A Walk on the Other Side: Adding Statistical Components to a Transfer-Based Translation System
This paper seeks to complement the current trend of adding more structure to Statistical Machine Translation systems, by exploring the opposite direction: adding statistical components to a Transfer-Based MT system. Initial results on the BTEC data show significant improvement according to three automatic evaluation metrics (BLEU, NIST and METEOR).
A Walk on the Other Side: Using SMT Components in a Transfer-Based Translation System
This paper seeks to complement the current trend of adding more structure to Statistical Machine Translation systems, by exploring the opposite direction: adding statistical components to a Transfer-Based MT system. Initial results on the BTEC data show significant improvement according to three automatic evaluation metrics (BLEU, NIST and METEOR).